Figure 2A: Multiple alignment of the CBH1 homologous sequences.

                            1                                               50
T. reesei mat         (1)   QSACTLQSETHPPLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNC
H. orientalis mat     (1)   QSACTLQTETHPSLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNC
H. schweinitzii mat   (1)   QSACTLQTETHPSLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNC
T. konilangbra mat    (1)   QSACTIQAETHPPLTWQKCSSGGSCTSQTGSVVIDANWRWTHATNSTTNC
T. pseudokoningii mat (1)   QSACTLQTETHPPLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNC
Consensus             (1)   QSACTLQTETHPPLTWQKCSSGGTCTQQTGSVVIDANWRWTHATNSSTNC

                            51                                             100
T. reesei mat         (51)  YDGNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSGNSLSIGFVTQSA
H. orientalis mat     (51)  YDGNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSADSLSIGFVTQSA
H. schweinitzii mat   (51)  YDGNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSADSLSIGFVTQSA
T. konilangbra mat    (51)  YDGNTWSSSLCPDNESCAKNCCLDGAAYASTYGVTTSADSLSIGFVTQSQ
T. pseudokoningii mat (51)  YDGNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSADSLSIGFVTQSA
Consensus             (51)  YDGNTWSSTLCPDNETCAKNCCLDGAAYASTYGVTTSADSLSIGFVTQSA

                            101                                            150
T. reesei mat         (101) QKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMD
H. orientalis mat     (101) QKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMD
H. schweinitzii mat   (101) QKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMD
T. konilangbra mat    (101) QKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMD
T. pseudokoningii mat (101) QKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMD
Consensus             (101) QKNVGARLYLMASDTTYQEFTLLGNEFSFDVDVSQLPCGLNGALYFVSMD

                            151                                            200
T. reesei mat         (151) ADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGQANVEGWEPSSNNAN
H. orientalis mat     (151) ADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGQANVEGWEPSSNNAN
H. schweinitzii mat   (151) ADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGQANVEGWEPSSNNAN
T. konilangbra mat    (151) ADGGVSKYPSNTAGAKYGTGYCDSQCPRDLKFINGEANVEGWEPASNNAN
T. pseudokoningii mat (151) ADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGEANVEGWEPFSNNAN
Consensus             (151) ADGGVSKYPTNTAGAKYGTGYCDSQCPRDLKFINGQANVEGWEPSSNNAN

                            201                                            250
T. reesei mat         (201) TGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQEICEGDGCGGTYSDN
H. orientalis mat     (201) TGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQEICDGDGCGGTYSND
H. schweinitzii mat   (201) TGIGGHGSCCSEMDIWEANSISEALTPHPCTNVGQEICDGDGCGGTYSND
T. konilangbra mat    (201) TGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQAICDGDGCGGTYSDD
T. pseudokoningii mat (201) TGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQEICDGDSCGGTYSGD
Consensus             (201) TGIGGHGSCCSEMDIWEANSISEALTPHPCTTVGQEICDGDGCGGTYS D

                            251                                            300
T. reesei mat         (251) RYGGTCDPDGCDWNPYRLGNTSFYGPGSSFTLDTTKKLTVVTQFETSGAI
H. orientalis mat     (251) RYGGTCDPDGCDWNPYRLGNTSFYGPGSSFTLDTTKKLTVVTQFETSGAI
H. schweinitzii mat   (251) RYGGTCDPDGCDWNPYRLGNTSFYGPGSSFTLDTTKKLTVVTQFETSGAI
T. konilangbra mat    (251) RYGGTCDPDGCDWNPYRLGNTSXYGPGSSFTLDTTKKMTVVTQFATSGAI
T. pseudokoningii mat (251) RYGGTCDPDGCDWNPYRLGNTSFYGPGSSFALDTTKKLTVVTQFETSGAI
Consensus             (251) RYGGTCDPDGCDWNPYRLGNTSFYGPGSSFTLDTTKKLTVVTQFETSGAI

                            301                                            350
T. reesei mat         (301) NRYYVQNGVTFQQPNAELGSYSGNELNDDYCTAEEAEFGGSSFSDKGGLT
H. orientalis mat     (301) NRYYVQNGVTYQQPNAELGSYSGNELNDDYCTAEESEFGGSSFSDKGGLT
H. schweinitzii mat   (301) NRYYVQNGVTYQQPNAELGSYSGNELNDAYCTAEESEFGGSSFSDKGGLT
T. konilangbra mat    (301) NRYYVQNGVTFQQPNAELGSYSGNTLNDAYCAAEEAEFGGSSFSDKGGLT
T. pseudokoningii mat (301) NRYYVQNGVTFQQPNAELGSYSGNSLDDDYCAAEEAEFGGSSFSDKGGLT
Consensus             (301) NRYYVQNGVTFQQPNAELGSYSGNELNDDYCTAEEAEFGGSSFSDKGGLT
Figure 2B: Multiple alignment of the CBH1 homologous sequences.

                            351                                            400
T. reesei mat         (351) QFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSCSTS
H. orientalis mat     (351) QFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSCSTS
H. schweinitzii mat   (351) QFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSCSTS
T. konilangbra mat    (351) QFKQATSGGMVLVMSLWDDYYANMLWLDSIYPTNETSSTPGAARGSCSTS
T. pseudokoningii mat (351) QFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSCSTS
Consensus             (351) QFKKATSGGMVLVMSLWDDYYANMLWLDSTYPTNETSSTPGAVRGSCSTS

                            401                                            450
T. reesei mat         (401) SGVPAQVESQSPNAKVTFSNIKFGPIGSTGNPSGGNPPGG-NPPGTTTTR
H. orientalis mat     (401) SGVPAQLESQSPNAKVVYSNIKFGPIGSTGNPSGGNPPGG-NPPGTTTTR
H. schweinitzii mat   (401) SGVPAQLESQSANAKVVYSNIKFGPIGSTGNPSGGNPPGG-NPPGTTTTR
T. konilangbra mat    (401) SGVPAQLESQSTNAKVVFSNIKFGPIGSTGNSSGGNPPGGGNPPGTTTTR
T. pseudokoningii mat (401) SGVPAQLESQSSNAKVVYSNIKFGPIGSTGNSSGGSPPGGGNPPGTTTTR
Consensus             (401) SGVPAQLESQS NAKVVYSNIKFGPIGSTGNPSGGNPPGG NPPGTTTTR

                            451                                           499
T. reesei mat         (450) RPATTTGSSPGPTQSHYGQCGGIGYSGPTVCASGTTCQ-VLNPYYSQCL
H. orientalis mat     (450) RPATTTGSSPGPTQTHYGQCGGIGYSGPTVCASGTTCQ-VLNPYYSQCL
H. schweinitzii mat   (450) RPATTTGSSPGPTQTHYGQCGGIGYSGPTICASGTTCQ-VLNPYYSQCL
T. konilangbra mat    (451) RPATTTGSSPGPTQTHYGQCGGIGYSGPTVCASGSTCQ-VLNPYYSQCL
T. pseudokoningii mat (451) RPATSTGSSPGPTQTHYGQCGGIGYSGPTVCASGSTCQ-VLNPYYSQCL
Consensus             (451) RPATTTGSSPGPTQTHYGQCGGIGYSGPTVCASGTTCQ VLNPYYSQCL
CGTCATCTCG GCCTTCTTGG CCACGGCCCG TGCTCAGTCG GCCTGCACTC 50
TCCAAACGGA GACTCACCCG TCTCTGACAT GGCAGAAATG CTCGTCTGGC 100
GGCACTTGCA CCCAGCAGAC AGGCTCCGTG GTCATCGACG CCAACTGGCG 150
CTGGACTCAC GCGACTAACA GCAGCACGAA CTGCTACGAC GGCAACACTT 200
GGAGCTCAAC CCTATGCCCT GACAACGAGA CTTGCGCGAA GAATTGCTGC 250
CTGGACGGTG CCGCCTATGC GTCCACGTAC GGAGTCACCA CGAGTGCCGA 300
CAGCCTCTCC ATCGGCTTCG TCACGCAATC TGCACAGAAG AACGTTGGCG 350
CCCGTCTCTA CCTGATGGCG AGTGACACGA CTTACCAGGA GTTCACGCTG 400
CTTGGCAACG AGTTCTCTTT TGACGTTGAT GTTTCGCAGC TGCCGTAAGT 450
GACAACCATT CCCCGCGAGG CCATCTTCTC ATTGGTTCCG AGCTGACCCG 500
CCGATCTAAG A TGTGGCTTG AACGGCGCTC TGTACTTCGT GTCTATGGAT 550
GCGGATGGTG GCGTGAGCAA GTATCCCACC AACACCGCCG GCGCCAAGTA 600
CGGCACGGGC TACTGCGACA GCCAGTGCCC CCGCGATCTC AAGTTCATCA 650
ACGGCCAGGC CAACGTTGAA GGCTGGGAGC CGTCCTCCAA CAACGCCAAC 700
ACGGGTATTG GCGGACACGG AAGCTGCTGC TCTGAGATGG ATATCTGGGA 750
GGCCAACTCC ATCTCCGAGG CTCTGACTCC TCACCCTTGC ACGACTGTTG 800
GCCAGGAGAT CTGCGACGGT GACGGCTGCG GCGGAACCTA CTCCAACGAC 850
CGATATGGTG GTACTTGCGA TCCTGATGGT TGTGATTGGA ATCCATACCG 900
CTTGGGCAAC ACCAGCTTCT ATGGCCCTGG CTCGAGCTTC ACCCTCGATA 950
CCACCAAGAA GTTGACCGTT GTCACCCAGT TCGAGACCTC GGGTGCCATC 1000
AACCGTTACT ATGTCCAGAA CGGCGTCACT TACCAGCAAC CCAACGCCGA 1050
GCTCGGTAGT TACTCTGGTA ATGAGCTCAA CGATGACTAC TGCACAGCTG 1100
AGGAGTCGGA ATTCGGCGGC TCCTCCTTCT CGGACAAGGG CGGCCTTACT 1150
CAGTTCAAGA AGGCCACTTC CGGCGGCATG GTCCTGGTCA TGAGCTTGTG 1200
GGATGAC GTG AGTTGATAGA CAGCATTCAC ATTGTCGTTG GAAAGACGGG 1250
CGGCTAACCG AGACATATGA TATCTAACAG TACTACGCCA ACATGCTGTG 1300
GCTGGACTCC ACCTACCCGA CAAACGAGAC CTCCTCCACC CCCGGCGCCG 1350
TGCGCGGAAG CTGCTCCACC AGCTCCGGCG TCCCCGCTCA GCTCGAGTCC 1400
CAGTCCCCCA ACGCCAAGGT CGTCTACTCC AACATCAAGT TCGGGCCCAT 1450
TGGCAGCACC GGCAACCCCA GCGGCGGAAA CCCTCCTGGC GGAAACCCTC 1500
CCGGCACCAC CACCACCCGC CGCCCAGCTA CCACCACTGG AAGCTCTCCC 1550
GGACCTACTC AGACTCACTA CGGCCAGTGC GGCGGCATCG GCTACAGCGG 1600
CCCTACGGTC TGCGCCAGCG GCACGACCTG CCAGG 1635

Figure 3: H. orientalis genomic DNA sequence.
MYRKLAVISA FLATARA 17

Figure 4A: H. orientalis amino acid signal sequence.



QSACTLQTET HPSLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC 50
YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSADS LSIGFVTQSA 100
QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD 150
ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN 200
TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICDG DGCGGTYSND 250
RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI 300
NRYYVQNGVT YQQPNAELGS YSGNELNDDY CTAEESEFGG SSFSDKGGLT 350
QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS 400
SGVPAQLESQ SPNAKVVYSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR 450
PATTTGSSPG PTQTHYGQCG GIGYSGPTVC ASGTTCQVLN PYYSQCL 497



Figure 4B: H. orientalis mature amino acid sequence. 
TCGGCCTGCA CTCTCCAAAC GGAGACTCAC CCGTCTCTGA CATGGCAGAA 50
ATGCTCGTCT GGCGGCACTT GCACCCAGCA GACAGGCTCC GTGGTCATCG 100
ACGCCAACTG GCGCTGGACT CACGCTACTA ACAGCAGCAC GAACTGCTAC 150
GACGGCAACA CTTGGAGCTC AACCCTGTGC CCTGACAATG AGACTTGCGC 200
GAAGAACTGC TGCCTGGACG GTGCCGCCTA TGCGTCCACG TACGGAGTCA 250
CCACGAGTGC CGACAGCCTC TCCATCGGCT TCGTGACACA GTCTGCACAG 300
AAAAACGTTG GCGCCCGTCT CTACCTGATG GCGAGTGACA CGACTTACCA 350
GGAGTTCACG CTGCTTGGCA ACGAGTTCTC ATTCGACGTT GATGTTTCGC 400
AGCTGCCGTA AGTGACAACC ATTCCCCCGA CGCCATCTTC TCATTGGTTC 450
GAAGCTGACC CGCCGATCTA AGATGTGGCT TGAACGGCGC TCTTTACTTC 500
GTGTCCATGG ACGCAGATGG TGGCGTGAGC AAGTATCCCA CCAACACCGC 550
CGGCGCCAAG TACGGCACGG GCTACTGTGA CAGCCAGTGC CCCCGCGATC 600
TCAAGTTTAT CAACGGCCAG GCCAACGTTG AAGGCTGGGA GCCGTCCTCC 650
AACAACGCCA ACACGGGTAT TGGCGGACAC GGAAGCTGCT GCTCCGAGAT 700
GGATATCTGG GAGGCCAACT CCATCTCCGA GGCTCTTACT CCTCACCCTT 750
GCACGAATGT TGGCCAGGAG ATCTGCGACG GTGACGGCTG CGGCGGAACC 800
TACTCCAACG ACCGATATGG TGGTACTTGC GATCCTGATG GTTGTGATTG 850
GAATCCATAC CGCTTGGGCA ACACCAGCTT CTATGGCCCT GGCTCGAGCT 900
TCACCCTCGA TACCACCAAG AAGTTGACCG TCGTCACCCA GTTCGAGACT 950
TCGGGTGCCA TCAACCGTTA CTATGTCCAG AATGGCGTCA CTTACCAGCA 1000
ACCCAACGCC GAGCTCGGCA GTTACTCTGG TAATGAGCTC AACGATGCCT 1050
ACTGCACAGC TGAAGAGTCG GAATTTGGCG GTTCCTCCTT CTCGGACAAG 1100
GGCGGCCTTA CTCAGTTCAA GAAGGCCACT TCCGGCGGCA TGGTCCTGGT 1150
CATGAGCTTG TGGGATGACG TGAGTCCATA GAACAGCATT CACATTGTCG 1200
TCGGAAAGAC GGCCGGCTAA CCGAGACATT ACAGTACTAC GCCAACATGC 1250
TGTGGCTGGA CTCCACCTAC CCGACAAACG AGACCTCCTC CACCCCCGGT 1300
GCCGTGCGCG GAAGCTGCTC CACCAGCTCC GGCGTCCCAG CTCAGCTCGA 1350
GTCCCAGTCC GCCAACGCCA AGGTCGTCTA CTCCAACATC AAGTTCGGAC 1400
CCATTGGCAG CACCGGCAAC CCCAGCGGCG GAAACCCTCC TGGCGGAAAC 1450
CCTCCCGGCA CCACCACCAC CCGCCGCCCA GCTACCACCA CTGGAAGCTC 1500
TCCCGGACCT ACTCAGACTC ACTATGGCCA GTGCGGCGGC ATCGGCTACA 1550
GCGGCCCTAC GATCTGCGCC AGCGGCACGA CCTGCCAGG 1589

Figure 5: H. schweinitzii genomic DNA sequence.
MYRKLAVITA FLATARA 17

Figure 6A: H. schweinitzii signal peptide.

QSACTLQTET HPSLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC 50
YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSADS LSIGFVTQSA 100
QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD 150
ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGQANVE GWEPSSNNAN 200
TGIGGHGSCC SEMDIWEANS ISEALTPHPC TNVGQEICDG DGCGGTYSND 250
RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF TLDTTKKLTV VTQFETSGAI 300
NRYYVQNGVT YQQPNAELGS YSGNELNDAY CTAEESEFGG SSFSDKGGLT 350
QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS 400
SGVPAQLESQ SANAKVVYSN IKFGPIGSTG NPSGGNPPGG NPPGTTTTRR 450
PATTTGSSPG PTQTHYGQCG GIGYSGPTIC ASGTTCQVLN PYYSQCL 497

Figure 6B: H. schweinitzii mature amino acid sequence (497 residues).
TCGGCCTGCA CCATTCAAGC GGAGACTCAC CCGCCTCTGA CATGGCAGAA 50
ATGCTCATCC GGTGGTAGTT GCACCTCGCA AACCGGTTCT GTGGTGATTG 100
ACGCGAACTG GCGATGGACT CACGCGACTA ACAGCACCAC GAACTGCTAC 150
GACGGTAACA CTTGGAGCTC CAGTCTTTGC CCCGACAATG AGAGTTGCGC 200
AAAGAACTGC TGCCTGGACG GTGCAGCCTA CGCATCCACG TACGGAGTCA 250
CCACGAGTGC TGATAGCCTC TCCATTGGCT TCGTCACTCA GTCTCAGCAG 300
AAGAATGTTG GCGCTCGTCT CTACCTGATG GCAAGCGACA CGACCTACCA 350
GGAATTTACC CTGCTTGGCA ACGAGTTCTC TTTCGATGTT GATGTTTCAC 400
AGCTGCCGTA AGTGACTAGC ATTTACCTCC GACGCCATCT CATTGATTCC 450
CAGCTGACGG CCAATTCAAG ATGTGGCTTG AACGGAGCCC TTTACTTCGT 500
GTCCATGGAC GCGGATGGTG GCGTGAGCAA GTATCCCTCC AACACTGCCG 550
GCGCCAAGTA CGGCACGGGC TACTGCGATA GCCAGTGTCC CCGTGATTTG 600
AAGTTCATCA ACGGCGAGGC CAACGTTGAG GGCTGGGAGC CGGCTTCGAA 650
CAACGCCAAC ACGGGTATTG GCGGACACGG AAGCTGCTGC TCTGAGATGG 700
ATATCTGGGA GGCCAACTCC ATCTCTGAGG CCCTTACTCC TCACCCTTGC 750
ACGACTGTCG GCCAGGCCAT TTGCGATGGT GACGGCTGCG GTGGAACCTA 800
CTCCGATGAC CGATATGGTG GTACTTGCGA TCCTGATGGC TGTGACTGGA 850
ACCCATACCG CTTGGGCAAC ACCAGCTTCT ACGGCCCCGG CTCGAGCTTC 900
ACCCTCGACA CCACCAAGAA GATGACCGTC GTCACCCAGT TCGCTACTTC 950
GGGTGCCATC AACCGATACT ATGTCCAGAA TGGCGTCACT TTCCAGCAGC 1000
CCAACGCCGA GCTCGGTAGC TACTCTGGCA ACACGCTCAA CGATGCTTAC 1050
TGCGCAGCTG AAGAGGCGGA ATTCGGCGGA TCATCTTTCT CAGACAAGGG 1100
TGGCCTTACC CAATTCAAGC AGGCTACTTC AGGCGGCATG GTCTTGGTTA 1150
TGAGCCTGTG GGATGACGTG AGTTCATGGA TAGCATTGAC ATTGTCGAGA 1200
GAACCATAGC CGCTGACCGA GACACAACAG TACTACGCCA ACATGCTGTG 1250
GCTGGACTCC ATCTACCCGA CGAACGAGAC CTCCTCTACC CCCGGTGCCG 1300
CGCGCGGAAG CTGCTCTACC AGCTCCGGTG TCCCTGCCCA GCTCGAGTCT 1350
CAGTCTACCA ACGCCAAGGT CGTCTTCTCC AACATCAAGT TCGGACCCAT 1400
TGGCAGCACT GGTAACTCCA GCGGCGGAAA CCCCCCGGGC GGAGGAAACC 1450
CCCCCGGCAC CACCACCACC CGACGCCCAG CTACCACCAC CGGAAGCTCT 1500
CCCGGACCTA CTCAGACACA CTATGGCCAG TGCGGTGGAA TTGGGTACTC 1550
GGGCCCCACG GTCTGCGCCA GCGGCAGCAC ATGCCAGG 1588

Figure 7: T. konilangbra genomic DNA.
MYRKLAVITA FLATARA 17

Figure 8A: T. konilangbra signal sequence.

QSACTIQAET HPPLTWQKCS SGGSCTSQTG SVVIDANWRW THATNSTTNC 50
YDGNTWSSSL CPDNESCAKN CCLDGAAYAS TYGVTTSADS LSIGFVTQSQ 100
QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD 150
ADGGVSKYPS NTAGAKYGTG YCDSQCPRDL KFINGEANVE GWEPASNNAN 200
TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQAICDG DGCGGTYSDD 250
RYGGTCDPDG CDWNPYRLGN TSXYGPGSSF TLDTTKKMTV VTQFATSGAI 300
NRYYVQNGVT FQQPNAELGS YSGNTLNDAY CAAEEAEFGG SSFSDKGGLT 350
QFKQATSGGM VLVMSLWDDY YANMLWLDSI YPTNETSSTP GAARGSCSTS 400
SGVPAQLESQ STNAKVVFSN IKFGPIGSTG NSSGGNPPGG GNPPGTTTTR 450
RPATTTGSSP GPTQTHYGQC GGIGYSGPTV CASGSTCQVL NPYYSQCL 498



Figure 8B: T. konilangbra mature amino acid sequence. 
TCGGCCTGCA CCCTCCAGAC GGAAACTCAC CCGCCTCTGA CATGGCAGAA 50
ATGCTCATCT GGTGGCACTT GCACCCAACA GACGGGCTCC GTGGTCATCG 100
ACGCGAACTG GCGCTGGACT CACGCTACGA ACAGCAGCAC GAACTGCTAC 150
GACGGTAACA CTTGGAGCTC AACCTTGTGC CCTGACAATG AGACTTGCGC 200
GAAGAACTGC TGCTTGGATG GTGCCGCCTA CGCGTCGACG TACGGAGTCA 250
CCACGAGCGC TGACAGCCTC TCCATTGGCT TCGTCACTCA GTCTGCGCAG 300
AAGAATGTCG GCGCCCGTCT CTACTTGATG GCGAGTGACA CGACCTACCA 350
AGAATTTACC CTGCTTGGCA ACGAGTTCTC CTTCGATGTT GATGTTTCCC 400
AGCTGCC GTA AGTGGCCAAC TACACCCCTT GACGGTATCC TCTCATTGGT 450
TCCCAGCTGA CTCGCGAAAT TAAGA TGTGG CTTGAACGGA GCTCTTTACT 500
TCGTGTCCAT GGACGCGGAT GGTGGCGTGA GCAAGTATCC CACAAACACT 550
GCCGGCGCCA AGTACGGCAC GGGTTACTGT GACAGCCAGT GCCCTCGTGA 600
TCTCAAGTTC ATCAACGGCG AGGCCAACGT TGAGGGCTGG GAGCCGTTCT 650
CCAACAACGC CAACACGGGC ATTGGCGGAC ATGGAAGCTG CTGCTCTGAG 700
ATGGATATCT GGGAGGCCAA CTCCATCTCT GAGGCTCTTA CTCCTCATCC 750
TTGCACGACC GTCGGGCAGG AAATTTGCGA TGGTGACTCC TGCGGCGGAA 800
CCTACTCCGG TGATCGATAT GGCGGTACTT GCGATCCTGA TGGCTGCGAT 850
TGGAACCCAT ACCGCTTGGG CAACACCAGC TTCTACGGGC CCGGCTCAAG 900
CTTCGCTCTT GATACCACCA AGAAGTTGAC CGTTGTCACC CAGTTCGAGA 950
CTTCGGGCGC TATCAACCGG TACTACGTCC AGAATGGCGT CACTTTCCAG 1000
CAGCCCAACG CCGAGCTCGG TAGTTACTCT GGCAACTCGC TCGACGATGA 1050
CTACTGCGCG GCTGAAGAGG CGGAGTTTGG TGGCTCTTCT TTCTCGGACA 1100
AGGGCGGCCT TACTCAATTC AAAAAGGCTA CTTCCGGTGG CATGGTCTTG 1150
GTCATGAGCT TGTGGGATGA T GTGAGTTCA TGAATAGCAT TCAAACAGTC 1200
AACAGAATAA CAGCAGCTGA CTGAGACACA ATAG TACTAC GCCAACATGC 1250
TGTGGCTGGA CTCCACCTAC CCGACGAACG AGACCTCTTC CACCCCCGGT 1300
GCCGTGCGCG GAAGCTGCTC CACCAGCTCC GGTGTCCCTG CTCAGCTTGA 1350
GTCCCAGTCT TCCAACGCCA AGGTCGTATA CTCCAACATC AAGTTCGGCC 1400
CTATCGGCAG CACCGGCAAC TCCAGCGGCG GTAGCCCTCC CGGCGGAGGA 1450
AACCCTCCCG GTACCACGAC CACCCGCCGC CCAGCTACCT CCACTGGAAG 1500
CTCTCCCGGC CCTACTCAGA CGCACTATGG CCAGTGCGGT GGTATTGGGT 1550
ACTCGGGCCC CACGGTCTGC GCGAGTGGCA GCACTTGCCA GG 1592

Figure 9: T. pseudokoningii genomic DNA sequence.
Figure 10A: T. pseudokoningii signal sequence.



QSACTLQTET HPPLTWQKCS SGGTCTQQTG SVVIDANWRW THATNSSTNC 50
YDGNTWSSTL CPDNETCAKN CCLDGAAYAS TYGVTTSADS LSIGFVTQSA 100
QKNVGARLYL MASDTTYQEF TLLGNEFSFD VDVSQLPCGL NGALYFVSMD 150
ADGGVSKYPT NTAGAKYGTG YCDSQCPRDL KFINGEANVE GWEPFSNNAN 200
TGIGGHGSCC SEMDIWEANS ISEALTPHPC TTVGQEICDG DSCGGTYSGD 250
RYGGTCDPDG CDWNPYRLGN TSFYGPGSSF ALDTTKKLTV VTQFETSGAI 300
NRYYVQNGVT FQQPNAELGS YSGNSLDDDY CAAEEAEFGG SSFSDKGGLT 350
QFKKATSGGM VLVMSLWDDY YANMLWLDST YPTNETSSTP GAVRGSCSTS 400
SGVPAQLESQ SSNAKVVYSN IKFGPIGSTG NSSGGSPPGG GNPPGTTTTR 450
RPATSTGSSP GPTQTHYGQC GGIGYSGPTV CASGSTCQVL NPYYSQCL 498



Figure 10B: T. pseudokoningii mature amino acid sequence. 
Figure 11: pRAX1
Figure 12: Destination vector pRAXdes2 for expression in A. niger



gla promoter 
Figure 13: Replicative expression vector pRAXdesCBH1 for CBH1 genes under the control of
the glucoamylase promoter.




